A scoring model for phosphopeptide site localization and its impact on the question of whether to use MSA.

Authors

  • Juliana de S da G Fischer
  • Marlon D M Dos Santos
  • Fabricio K Marchini
  • Valmir C Barbosa
  • Paulo C Carvalho
  • Nilson I T Zanchin
Abstract

The production of structurally significant product ions during the dissociation of phosphopeptides is key to the successful determination of phosphorylation sites. These diagnostic ions can be generated using the widely adopted MS/MS approach, MS3 (data-dependent neutral loss, DDNL), or multistage activation (MSA). The main purpose of this work is to introduce a probabilistic false-localization rate (FLR) model to enable unbiased phosphoproteomics studies. Briefly, our algorithm infers a probabilistic function from the distribution of the XCorr Delta scores (XD-Scores) of the phosphopeptides identified in the current experiment. Our module infers p-values by relying on Gaussian mixture models and a logistic function. We demonstrate the usefulness of our probabilistic model by revisiting the "to MSA, or not to MSA" dilemma. For this, we used human leukemia-derived cells (K562) as a study model and enriched for phosphopeptides using hydroxyapatite (HAP) chromatography. The aliquots were analyzed with and without MSA on an Orbitrap-XL. Our XD-Scoring analysis revealed that the MS/MS approach provides more identifications because of its faster scan rate, but that, at the same scan rate, MSA yields higher-confidence spectra. Our software is integrated into PatternLab for proteomics, which is freely available to the academic community at http://www.patternlabforproteomics.org.

Biological significance: Assigning statistical confidence to phosphorylation sites is necessary for proper phosphoproteomic assessment. Here we present a rigorous statistical model, based on Gaussian mixture models and a logistic function, which overcomes shortcomings of previous tools. The algorithm described herein is made readily available to the scientific community by integrating it into the widely adopted PatternLab for proteomics.

This article is part of a Special Issue entitled: Computational Proteomics.
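The abstract names two ingredients, Gaussian mixture models and a logistic function, without giving the details of how they are combined. The following is a minimal sketch of one way such a pairing can work, not the authors' implementation or the PatternLab code. It assumes that scores come from two Gaussian populations (poorly and well localized sites) that share one variance, in which case the posterior probability of the high-score population is exactly a logistic function of the score. The synthetic scores, the function names, and the FLR estimate (the mean posterior error among accepted scores) are all illustrative assumptions.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_tied_mixture(scores, iterations=100):
    """EM for a two-component 1-D Gaussian mixture with a shared variance.

    Component 0 is the low-score population, component 1 the high-score one.
    """
    n = len(scores)
    ordered = sorted(scores)
    mu = [ordered[n // 4], ordered[(3 * n) // 4]]  # start at the quartiles
    mean_all = sum(scores) / n
    var = sum((x - mean_all) ** 2 for x in scores) / n
    w = [0.5, 0.5]
    for _ in range(iterations):
        # E-step: responsibility of the high-score component for each score.
        resp = []
        for x in scores:
            log_odds = (math.log(w[1] / w[0])
                        + ((x - mu[0]) ** 2 - (x - mu[1]) ** 2) / (2.0 * var))
            resp.append(sigmoid(log_odds))
        # M-step: update means, shared variance, and mixing weights.
        n1 = sum(resp)
        n0 = n - n1
        mu[0] = sum((1.0 - r) * x for r, x in zip(resp, scores)) / n0
        mu[1] = sum(r * x for r, x in zip(resp, scores)) / n1
        var = sum((1.0 - r) * (x - mu[0]) ** 2 + r * (x - mu[1]) ** 2
                  for r, x in zip(resp, scores)) / n
        w = [n0 / n, n1 / n]
    return w, mu, var


def logistic_coefficients(w, mu, var):
    """Intercept and slope of the posterior log-odds, linear in the score."""
    slope = (mu[1] - mu[0]) / var
    intercept = math.log(w[1] / w[0]) + (mu[0] ** 2 - mu[1] ** 2) / (2.0 * var)
    return intercept, slope


def p_correct(score, intercept, slope):
    """Posterior probability that a score belongs to the high-score component."""
    return sigmoid(intercept + slope * score)


def estimated_flr(scores, threshold, intercept, slope):
    """Mean posterior error among scores at or above the threshold."""
    accepted = [x for x in scores if x >= threshold]
    if not accepted:
        return 0.0
    return sum(1.0 - p_correct(x, intercept, slope) for x in accepted) / len(accepted)


# Synthetic stand-in for XD-Scores (hypothetical values, not real data).
random.seed(0)
scores = ([random.gauss(0.5, 1.0) for _ in range(300)]
          + [random.gauss(5.0, 1.0) for _ in range(700)])

w, mu, var = fit_tied_mixture(scores)
intercept, slope = logistic_coefficients(w, mu, var)
print("fitted means:", mu)
print("estimated FLR at threshold 3.0:", estimated_flr(scores, 3.0, intercept, slope))
```

The shared-variance assumption is what makes the logistic form exact here; with unequal variances the log-odds become quadratic in the score, and a logistic curve would only approximate the mixture posterior.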

Journal:
  • Journal of proteomics

Volume 129

Publication date: 2015